Translated by Google Translate
State-of-the-art language models are often accurate on question-answering benchmarks with well-defined questions. Yet in real settings, questions are often unanswerable without asking the user for clarifying information. We show that current SotA models often do not ask the user for clarification when presented with imprecise questions and instead provide incorrect answers or "hallucinate". To address this, we introduce CLAM, a framework that first uses the model to detect ambiguous questions and, if an ambiguous question is detected, prompts the model to ask the user for clarification. Furthermore, we show how to construct a scalable and cost-effective automatic evaluation protocol using an oracle language model with privileged information to provide clarifying information. We show that our method achieves a 20.15 percentage point accuracy improvement over SotA on a novel ambiguous question-answering data set derived from TriviaQA.
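The detect-then-clarify flow described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `clam_answer`, the prompt wording, and the `model` and `ask_user` callables are all hypothetical stand-ins. `model` represents any text-in, text-out language model, and `ask_user` represents either a human or the oracle model with privileged information.

```python
from typing import Callable


def clam_answer(question: str,
                model: Callable[[str], str],
                ask_user: Callable[[str], str]) -> str:
    """Hypothetical CLAM-style loop: detect ambiguity, clarify, then answer."""
    # Step 1: let the model itself judge whether the question is ambiguous.
    # The prompt text is an assumption, not taken from the paper.
    verdict = model(f"Is this question ambiguous? Answer yes or no.\nQ: {question}")
    if verdict.strip().lower().startswith("yes"):
        # Step 2: have the model produce a clarifying question and pass it
        # to the user (or to an oracle model during automatic evaluation).
        clarifying_question = model(f"Ask one clarifying question for: {question}")
        clarification = ask_user(clarifying_question)
        question = f"{question}\nClarification: {clarification}"
    # Step 3: answer the (possibly disambiguated) question.
    return model(f"Answer the question.\nQ: {question}")
```

In this sketch an unambiguous question never reaches `ask_user`, so the user is only interrupted when the model flags ambiguity.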
Compartmental models are a tool commonly used in epidemiology for the mathematical modelling of the spread of infectious diseases, with their most popular representative being the Susceptible-Infected-Removed (SIR) model and its derivatives. However, current SIR models are limited in their ability to model government policies in the form of non-pharmaceutical interventions (NPIs) and weather effects, and they offer limited predictive power. More capable alternatives such as agent-based models (ABMs) are computationally expensive and require specialized hardware. We introduce a neural-network-augmented SIR model that can be run on commodity hardware, takes NPIs and weather effects into account, and offers improved predictive power as well as counterfactual analysis capabilities. We demonstrate our model's improvement over the state of the art in modeling COVID-19 in Austria during the March 2020 to March 2021 period and provide an outlook for the future up to January 2024.
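For readers unfamiliar with the baseline, the classic SIR dynamics can be sketched as below. This is only the plain compartmental model with forward-Euler integration, not the neural-network-augmented variant from the abstract. The function name `simulate_sir` and the time-dependent `beta(t)` callable are assumptions, chosen to mark the place where a transmission rate driven by NPIs or weather could be plugged in.

```python
from typing import Callable, List, Tuple


def simulate_sir(beta: Callable[[float], float], gamma: float,
                 s0: float, i0: float, r0: float,
                 days: int, dt: float = 0.1) -> List[Tuple[float, float, float]]:
    """Integrate dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I."""
    s, i, r = s0, i0, r0
    n = s0 + i0 + r0                      # total population, constant over time
    trajectory = [(s, i, r)]              # one entry per day, starting at day 0
    steps_per_day = round(1 / dt)
    for day in range(days):
        for k in range(steps_per_day):
            t = day + k * dt
            new_infections = beta(t) * s * i / n * dt
            new_removals = gamma * i * dt
            s -= new_infections
            i += new_infections - new_removals
            r += new_removals
        trajectory.append((s, i, r))
    return trajectory
```

A call such as `simulate_sir(lambda t: 0.3, 0.1, 990, 10, 0, days=100)` uses a constant transmission rate; the rate values here are illustrative, not fitted to any data.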
Convolutional neural networks (CNNs) define the state-of-the-art solution on many perceptual tasks. However, current CNN approaches largely remain vulnerable to adversarial perturbations of the input that have been crafted specifically to fool the system while being quasi-imperceptible to the human eye. In recent years, various approaches have been proposed to defend CNNs against such attacks, for example by model hardening or by adding explicit defence mechanisms. In the latter case, a small "detector" is included in the network and trained on the binary classification task of distinguishing genuine data from data containing adversarial perturbations. In this work, we propose a simple and lightweight detector, which leverages recent findings on the relation between networks' local intrinsic dimensionality (LID) and adversarial attacks. Based on a re-interpretation of the LID measure and several simple adaptations, we surpass the state of the art on adversarial detection by a significant margin and reach almost perfect results in terms of F1-score for several networks and datasets. Sources available at: https://github.com/adverML/multiLID
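As background on the LID measure mentioned above, the following sketch shows the commonly used maximum-likelihood LID estimate computed from the distances to a point's k nearest neighbours. It is the standard estimator, not the re-interpreted variant this abstract proposes, and the function name `lid_mle` and its parameters are illustrative assumptions. In a detector setting, `query` and `reference` would typically be network activations, which is likewise an assumption here.

```python
import math
from typing import Sequence


def lid_mle(query: Sequence[float],
            reference: Sequence[Sequence[float]],
            k: int = 20) -> float:
    """Standard MLE of local intrinsic dimensionality around `query`.

    LID = -1 / mean_i( log(r_i / r_k) ), where r_1 <= ... <= r_k are the
    distances from `query` to its k nearest neighbours in `reference`.
    """
    # Drop zero distances so that the query itself (if present) is ignored.
    dists = sorted(d for d in (math.dist(query, p) for p in reference) if d > 0.0)[:k]
    r_max = dists[-1]
    mean_log_ratio = sum(math.log(d / r_max) for d in dists) / len(dists)
    # Undefined if all k neighbours are equidistant (mean_log_ratio == 0).
    return -1.0 / mean_log_ratio
```

For points sampled from a low-dimensional surface embedded in a higher-dimensional space, the estimate tends towards the dimension of the surface rather than that of the ambient space.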
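Collecting and annotating task-oriented dialog data is difficult, especially for highly specific domains that require expert knowledge. At the same time, informal communication channels such as instant messengers are increasingly used at work. As a result, a lot of work-related information is disseminated through these channels and needs to be post-processed by employees. To alleviate this problem, we present TexPrax, a messaging system for collecting and annotating problems, causes, and solutions that occur in work-related chats. TexPrax uses a chatbot to engage employees directly, so that they provide lightweight annotations of their conversations and their documentation work is eased. To comply with data privacy and security regulations, we use end-to-end message encryption and give users full control over their data, which has various advantages over conventional annotation tools. We evaluate TexPrax in a user study with German factory employees who ask their colleagues for solutions to problems that arise during their daily work. Overall, we collect 201 task-oriented German dialogues containing 1,027 sentences with sentence-level expert annotations. Our data analysis also reveals that real-world conversations frequently contain instances of code-switching, varying abbreviations for the same entity, and dialects, all of which NLP systems should be able to handle.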